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(57) Abstract 

A system and method for providing a client-side ( 11 OA, 1 10B, 1 ION) multimedia program enables delaying the actual final production 
of the video until a user begins viewing the program, at which time the system or method creates the video dynamically. To effectively use 
the transmission bandwidth, the system, instead of sending fully finished video frames and audio tracks, sends various pieces of raw content 
and instructions on how to create the final program from the raw contents. Since the system does not produce the final program until a 
viewer watches it, the system allows changes to the program at any time, even during playback. Using an Elastic Edit Decision List (EDL) 
(SI 6), the system allows program customization including removing segments of shows already seen by the viewer, expanding segments 
in which the viewer previously expressed interest, and customizing a commercial content to the viewer's tastes. Allowing broadcasting 
high-quality television streams over low-bandwidth channels, the invention is advantageous over prior art techniques which transmit video 
and audio tracks in their entirety and are thus limited by the available bandwidth. 
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WO 00/10329 PCT/US99/18292 

CLIENT-SIDE DIGITAL TELEVISION AUTHORING SYSTEM 



BACKGROUND OF THE INVENTION 

1 . Field of the Invention 

5 This invention relates generally to multimedia production, and more particularly 

to producing low bandwidth, high quality multimedia content programs in client-side 
systems. 

2. Description of the Background An 

As computer and media technology improves, there is an increasing need for 
10 sending video over various digital distribution media, such as corporate intranets, by 
modem over telephone lines point-to-point, and over the Internet. Transmitting high 
quality video over these media traditionally requires high bandwidths, without which the 
final product is often erratic and of low resolution for viewing. 

Current digital video technologies generally are associated with high data 
15 volumes and thus require high transfer rates. For example, the ITU-R 601 standard used 
in professional digital video production provides for an NTSC frame of 720 pixels x 486 
scan lines, with eight-bit 4:2:2 sampling of Y, R-Y, and B-Y color components. At sixty 
fields per second, this frame structure results in a 20 megabyte per second data stream, 
which far exceeds rates currently available in common networks. 
20 Presently, fast modems connected to the Internet transfer up to 56 kilobits per 

second. Transferring a high quality compressed (for instance, 5:1) ITU-R 601 signal 
would require a further compression of 750: 1 . Compression of this magnitude would 
render the underlying signal unrecognizable. Even a 30:1 compressed image stream, 
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which is relatively good quality, would require a further compression of 125: 1. Various 
alternative techniques, such as smaller picture sizes and slower frame rates, enable the - 
compressed video stream to be viewable, but these techniques degrade the images. 
Digital networks have similar limited bandwidth deficiencies. Examples of 
5 digital networks include modem-to-modem over the conventional telephone network, the 
Internet with modems, the Internet with an Integrated Services Digital Network (ISDN) 
line at one end, the Internet with ISDN lines at both ends, and a corporate 10 Base T 
Local Area Network (LAN). 

Latency presents additional problems for digital networks. The Internet 
10 architecture does not guarantee any particular transfer bandwidth to be continuous over 
time. Existing video streaming architectures attempt to solve this problem by filling a 
buffer in the client (receiver) side with future video and audio frames so that, when a gap 
occurs in the Internet transmission, video is played from the buffer with the hope that the 
transmission will resume before the buffer is emptied. However, the transmission often 
15 does not resume in time, which, as the frame rate drops, can cause a pause in the video 
stream accompanied by an audio glitch. The buffering techniques also require that the 
buffer be filled before the video program begins playing. Consequently, as the size of the 
buffer increases, the time delay before the video playback begins also increases. 
Additionally, avoiding glitches in playback requires that the buffer be re-filled before the 
20 next transmission interruption. 

Current Internet multimedia is a poor substitute for media content which has been 
available on television for decades. The World Wide Web (the "web") is a collection of 
millions of electronic documents consisting primarily of text and still images, linked 
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together and accessible by anyone with a suitable Internet browser. This huge database 
of information is typically navigated, or ''surfed." by using various search tools to find - 
the desired information. Surfing, or navigating, the Internet is referred to as an "active" 
task in which the user actively engages in, and indeed controls, the exploration and 
acquisition of information. 

In contrast, viewing traditional film and television programs is "passive," in that 
someone else (the storyteller, news playback engines, etc.) is conveying information to 
the viewer with rro involvement on the viewer's part (other than perhaps changing the 
channel to a different program). Both active and passive experiences are desirable, and 
much activity has been focused on attempting to create a compelling passive experience 
over the Internet. For example, "push" technologies in newer Internet software packaees 
allow users to subscribe to various "channels" offered by web sites. Multimedia-stvie 
presentations are then delivered ("pushed") for later viewing, without explicit requests for 
the data in the presentation. These multimedia animations currently consist largely of 
filled vector (outline) shapes and text, which animate on-screen. Although more 
interesting than static imagery, this animation technology is not as sophisticated as 
broadcast television graphics imagery, often moving over small areas of the screen, and 
at slow, non-uniform frame rates. Given that most consumers are used to watching high- 
quality video productions on television, passive multimedia animations will not achieve 
broad appeal in their current form. 

In light of the prior art deficiencies, what is needed is a multimedia system that 
provides traditional broadcast-quality context while, at the same time, enabling efficient 
transmission over a low bandwidth communication system. 
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SUMMARY OF THE INVENTION 
The present invention provides a multimedia system and method for producing 
client-side low-bandwidth television (LBTV) program. The invention first creates scripts 
of the programs in an authoring system and stores them on a server, which, upon 
5 receiving a request from a user to view a program, transfers the corresponding Elastic 
Edit Decision List (EDL) and a small amount of preliminary content to the user's client 
system. The EDL serves as the script and includes instructions for the client system to 
produce and orchestrate the program in real-time on the client's hardware. 

The client system then executes the EDL for viewing the program with as little 
1 0 delay as possible. Even if a portion of the program script is received before all of the 

content has been transferred, the client system plays that portion of the script. While the 
client system is running one program segment, the program searches for data for 
subsequent segments, which may be locally available in the client system or be 
transferred from the server. If the data is not available, then the program may ask the 
15 client system to generate the data, use another piece of data having similar functions, or 
perform other functions to keep displaying the program continuously and smoothly, 
including prolonging display of an image, redisplaying an image that has been displayed 
earlier, or playing some graphic animation. Further, while the program is running, the 
server sends in the background additional raw content that is not available in the client 
20 system. 

The invention, allowing producing the program and receiving raw program 
content in real-time, is advantageous over prior art techniques in which the final program 
including the video and audio tracks are transmitted in their entirety. The invention, to 
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effectively use transmission bandwidths, sends only a description of how torecreate a 
given scene, rather than sending every video frame. Consequently, the invention enables 
the use of low bandwidth media to broadcast high-quality imagery, at least as good as in 
a television broadcast. 
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BRIEF DESCRIPTION OF THE DRAWINGS 
FIG. I is a schematic overview of a network comprising the authoring system and 
clients of the preferred embodiment of the invention; 

FIG. 2 is a block diagram illustrating the architecture of the authoring system: 
5 FIG. 3 is a schematic showing a timeline editor which is used to produce the 

event-driven storyboard for an LBTV program; 

FIG. 4 shows an exemplary storyboard representing an LBTV program created in 
accordance with the invention; 

FIG. 5 shows an LBTV program memory map created from the storyboard of 

10 FIG. 4; 

FIG. 6 is a schematic overview of an architecture of a client system in FIG. 1 ; 
FIG. 7 is a schematic showing a playback engine of a client system; and 
FIG. 8 is a flowchart illustrating the method of the invention from the step of 
creating the LBTV scripts through the step of playing the LBTV program. 
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DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT 



Referring now to FIG. I , a schematic overview is shown of a network 100 of the 
preferred embodiment, which includes an authoring system 104, a server 108, and a 
plurality of client systems ('^clients") 1 1 OA to 1 ION, all of which interconnect via the 
5 Internet 114. Authoring system 104 enables generation of Low Bandwidth Television 
(LBTV) software program scripts, which include raw data and instructions to construct a 
"client-side" video/television program from raw data. 

In the preTerred embodiment, server 108 stores the scripts created by authoring 
system 104 and transmits the scripts to clients 1 10. Program scripts preferably reside as 
10 disk files on server 108. Other network distribution systems, such as corporate intranets, 
Local Area Networks (LANs), Wide .Area Networks (WANs), 10 base T networks, and 
terrestrial digital Radio Frequency (RF) broadcasting, or modem (point-to-point) 
distribution systems are also effective. 

When a viewer desires to watch a program, the viewer may cause the viewer's 
15 client 1 10 to contact server 108. Server 108 transfers the program script consisting of an 
Elastic Edit Decision List (EDL) file, along with a small amount of preliminary content 
to client 1 10. The Elastic EDL file serves as the script by which client 1 10 will 
orchestrate the final production of the video. Client 1 10 begins executing the EDL file 
with as little delay as possible. Additional raw content that is not already stored on client 
20 1 10 is sent from the server 108 in temporal order over the Internet 1 14. Once the EDL 
file begins executing, the program sequentially displays the media content according to 
the script. Unlike many conventional computer media systems, however, program 
execution does not stop and wait for the availability of content. If a given section of the 
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program occurs before ail of the content has been transferred, client 1 10 continues and 
plays that section of the program, typically using "stand-in" content, such as line 
drawings or substitute images. Elastic EDLs, developed by Play. Inc., a corporation 
having headquarters in Rancho Cordova, are an improvement over the traditional EDL 
that includes an inflexible list of video clip segments for producing the video in order of 
the segments. In contrast, an Elastic EDL is "elastic," that is. it contains information and 
instructions to create the LBTV program on the fly, including information on whether to 
adjust the program during displaying such as prolonging a video segment or substituting 
the segment with another segment, etc. Consequently, an Elastic EDL, when required to 
compensate for the bandwidth variations, allows scaling of the program. Further, an 
Elastic EDL, taking accounts a user's previous playing of a program, may contain a 
user's preferences, and allows subsequent playing based on these preferences. 

Although FIG. 1 illustrates a preferred embodiment in which the authoring system 
104 and client 1 10 are separated by the Internet 1 14, alternatively, a single application 
program may contain the authoring system 104. The application program is loaded by a 
user on client 1 10, which then can create, edit, and execute program scripts. In this 
alternative embodiment, server 108 may contain content which is utilized by the program 
scripts, or all necessary program content may be loaded onto client 1 10 as pan of the 
application program. In addition, program scripts creaied by one client 1 10 can be 
executed by other clients 1 10 in such an embodiment. 

Referring now to FIG. 2, a block diagram illustrates the architectural details of the 
authoring system 104. Authoring system 104 includes a data bus 201 connecting a 
Central Processing Unit (CPU) 204 to a plurality of memories, which store data and 
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application programs that run on CPU 204. These memories and applications include a 
timeline editor 208, a music sound track module 212, an audio effects module 216, a 
video content module 220, a video effects module 224, a text generation module 228, and 
a communication unit 236. 
5 Timeline editor 208 is used to generate an event-driven storyboard for an LBTV 

program from music sound tracks, audio effects, video, video effects, and text stored in 
respective modules 212, 216, 220, 224, and 228. Timeline editor 208 can also retrieve 
audio and video objects from a computer hard disk or other means such as a scanner or a 
drawing tool. To reduce the video clip size, timeline editor 208 divides a Ions video clip 
10 into short pieces of video tracks, and when transferring the data to client 1 10, server 108 
sends the entire audio tracks and these short pieces of video tracks. If all video tracks 
have not been received during display, client system 1 10, in accordance with instructions 
in the Elastic EDL, may replace the belated video tracks with cut-aways of camera-stand 
style still movies, animating graphics, or stock video clip footage. 
15 Music sound track module 212 contains information associated with music. For 

example, music sound track 212 defines how long a piece of music will be played, how 
loud the musical piece will be played, what music key the piece will be performed in, and 
whether the music will be looped for continuous playing. The music is preferably stored 
as MIDI or compressed .WAV files. 
20 Audio effects module 216 produces various audio effects, such as control of 

music tracks, Foley effects, and voice-overs. Audio effects module 216 can also mix 
multiple audio tracks and add special effects, including reverberation and delays in real- 
time. 
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Video content module 220 contains the video objects from which an LBTV 

program is created. These video objects are stored in a variety of forms such as still 

pictures, short video clips, animated computer graphics, bitmaps, or other movie files 

suitable for QuickTime movies. Video content module 220 also contains information 

about how the video objects are to be used, e.g.. whether a picture will be displayed in 

full size or half size, and whether the picture will be cropped or trimmed to exclude 

unwanted images. 

Video effects module 224 enables application of a variety of visual effects to 
picture images in an LBTV program. Examples of video effects include cuts, dissolves, 
fades to and from black, wipes, organic animating wipes, and digital video effects such as 
push, pull, and flying video planes. Additionally, video effects module 224 can emulate 
various conventional camera techniques used in traditional video production, such as 
slow camera pans and zooms across still images. Video effects module 224 also provides 
transitional smoothing from one scene to another, from one image still to another, and 
from one video clip to another. 

Text module 228 may be a conventional text processing program and is used to 
design and generate text to be displayed with the video.. Preferably, text module 228 
enables text to be created as an object and placed anywhere on a video page. 

Authoring system 104 uses communication unit 236 to transmit and receive 
from other systems, including server 108 and clients 1 10. Communication unit 236 is a 
conventional network communication device such as a router or modem. 

Referring now to FIG. 3, a block diagram illustrates the architecture of the 
timeline editor 208 that is used to produce the video storyboard for an LBTV program. 
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Timeline editor 20S includes a source element browser 308. an element library 3 12, a 
plurality of object property editors 320, a storyboard display 326, an LBTV script 
compiler 330, and a playback engine 334. Source element browser 308 enables a user to 
find video objects and bring them into timeline editor 208. These video objects may 
5 reside on an Internet web page, a CD-ROM, or a PC hard disk, but preferably are 

contained in the element library 3 12 connected to element browser 308. These video 
objects may also be provided from a scanner, a software video object drawing tool, or 
other means of providing video elements. Additionally, the video objects can be 
"dragged" and ''dropped" into timeline editor 208 for editing. 

10 Each type of video object is edited using a specific object editor 320 suitable for 

the particular type of object. For example, text objects are conventionally edited using a 
word processor, still pictures are generally edited using a "Paint" program, and so on. 
Although object editors 320 are shown as residing within timeline editor 208, object 
editors 320 could reside anywhere in authoring system 104, so long as they can be 

15 accessed by timeline editor 208. Video objects are displayed, preferably after having been 
edited, in a storyboard format using the storyboard display 304. A storyboard 
representation of an LBTV program is preferably displayed as a sequence of icons or 
pictures. Each icon in the sequence represents a script object. For example, a still image 
in the sequence might be represented by a thumbnail of the image. A music object 

20 (.WAV file) might be represented by a music symbol displayed with the name of the file. 
An animated sequence might be represented by a thumbnail image of the first frame of 
the sequence. LBTV compiler 330 generates an LBTV program script from the edited 
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storyboard. Playback engine 334 allows a user to review a program script created by 
timeline editor 208. 

Referring now to FIG. 4. a block diagram illustrates an exemplary storyboard 400 
of an LBTV program created in accordance with the invention. Storyboard 400 is event- 
driven and includes, for example, from time T 0 to time T5, a "fade from black/' a still 
image A, video effects for the still image A, a still image B, a virtual camera pan, and a 
still image C, respectively. 

Initially at time T 0 the program preferably starts by "fading from black" to still 
image A at T r Other colors and/or effects may substitute for the fade-from-black 
transition. The fading time is preferably synchronized to a real-time clock, and is 
independent of the hardware and/or graphic card of the client 1 10. At time T, still image 
A is displayed. At time T 2 video effects are added to enhance image A. Examples of the 
video effects applied to still image A include wipes, organic animating wipes, and digital 
video effects such as push, pull, and flying video planes. Other audio and musical effects 
may also be added. At time T r still image B is displayed, and like still image A, image B 
may be accompanied by video and audio effects. At time T 4 , a virtual camera pan is used 
to pan across image B. The virtual camera pan is a video effect in which the viewer is 
able to see only a portion of the still image being viewed. The portion of the image 
displayed to the viewer changes slowly, giving the viewer the perspective of either 
moving his eyes across a panoramic view or of having the image move relative to the 
viewer. The virtual camera pan effect at T 4 contains video data related to the panning 
paths so that the paths can be constructed in client 110. At time T5, image B is replaced 
by still image C. Since no transition video effect separates still images B and C (such as 
a fade or a wipe), image C merely replaces image B. 

12 
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As the FIG. 4 storyboard display 304 executes, each subsequent image, sound, or 
video effect is loaded by playback engine 334 for execution. For a variety of reasons ~ 
including a large file size, network (such as Internet 1 14) latency, or resource congestion, 
the image or sound may not be immediately available for playback. When images are 
5 unavailable, programmed playback engine 334 continues executing the storyboard by 
substituting some alternative content for the missing image. For example, if at time T, 
image B is not available to replace the still image A of time T., the program may continue 
to display still image A. Alternatively, a line drawing or stock photograph may be 
substituted for the image A. Another alternative can be that playback engine 324 loops to 

10 groups of images that have been displayed earlier. In the storyboard 400 the invention 
continuously redisplays events from time T, to time T 4 that include still image A, video 
effects for image A, still image B, and virtual camera panning for image B. Other 
options to keep the program running continuously include looping the music, looping the 
graphics animation, and replaying parts of the storyboard. The invention can also 

15 substitute stand-in video content, change picture color to black and white, or substitute 
the transitions from one image to another image. In any event, as with traditional 
television, the programming must continue to play. This is in contrast to more traditional 
multimedia systems where the playback system generally stops executing and waits for 
the missing image video or audio clip to load before playback resumes. 

20 Referring now to FIG. 5, a block diagram illustrates an LBTV program memory 

map 500 containing the program script created from the storyboard 400 of FIG. 4. In the 
program script, a set of parameters is associated with each FIG. 4 time event. These 
parameters are stored in memory sections 504, 508, 510, 512, 516, and 520 for events at 
time T 0 , Ti, T 2 , T 3 , T 4 , and T 5 , respectively. In section 504, the fade from black event 



13 



WO 00/10329 PCT/US99/1829 
includes biaek as the initial color, and one second as the minimum duration time for 
fading. In section 508, image A is stored as a compressed bitmap. Additional parameters 
appended to image A indicate that image A will be displayed for a minimum of five 
seconds, and has a cropping value of (X u Y it X 2 , Y 2 ). Image A also has a zoom factor Z. 
Although in the embodiment shown in FIG. 5, the images (including still ima^e A) are 
shown stored in sequential memory, memory section 508 in fact preferably points to an 
image library where image A is actually stored. 

Section 510 contains the video effect occurring at time T 2 . The video effect is 
described as having a video effects speed of one, with still image A traveling from left 
(L) to right (R), and image A having a red border- 
Section 5 12 contains information for compressed image B. Image B is described 
as being displayed for ten seconds. Following this ten second display of image B, a 
virtual camera pan is described in section 5 16 as occurring with respect to image B. As 
previously described, the virtual camera pan is a pre-programmed effect, which in this 
case preferably results in a left to right .sweep of the image across the video screen. The 
virtual camera pan of section 516 indicates that no zoom of the still image B will occur. 
Memory section 520 stores the still image C described with reference to FIG. 4 as being 
displayed at time T 5 . In the program script corresponding to the storyboard 400 of FIG. 
4, still image C is the last element. 

Referring now to FIG. 6, a block diagram illustrates the preferred architecture of 
FIG.l client 1 10, which can be a standard personal computer (PC), an LBTV compatible 
set-top box, or other system capable of executing software and displaying multimedia 
content. Since LBTV programs are actually created in a high-level description language, 
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the program's actual content can be generated in real-time to fit the capabilities of the 
client i 10 system. In the case of a PC, client 1 10 is preferably a Pentium computer with, 
a PCI video display card. The real-time playback of an LBTV program preferably does 
not require a graphics card with any special hardware capabilities. In the preferred 
5 embodiment, client 1 10 works well on a PC. without compromising the broadcast 
television mandates of smooth, continuous, and well-produced content. LBTV- 
compatible set-top boxes also preferably use standard PC display technologies, resulting 
in a low cost, stand-alone unit. In both PC and set-top configurations, the LBTV 
architecture preferably combines the standard PC display chipsets, a high-speed, general 

10 purpose CPU, and flexible software application modules to maintain a professional 
quality television program, even over a low bandwidth connection. 

Client 1 10 includes a client CPU 604 connected via data bus 601 to a plurality of 
memories which store data and application programs that run on the CPU 604, including 
a network browser 608, a playback engine 6I2 ? a script library 616, a video library 620, 

15 an algorithmic video and audio module 628, an audio library 634, and a keying and 
layering engine 638. A display 640 and a communication unit 642 are also 
conventionally coupled to data bus 601. 

Internet browser 608 allows a user to request a stored LBTV program, such as 
from a web site, and to download various media resources (images, effects, video clips, 

20 etc.) that might be required. Playback engine 612 provides intelligence for an LBTV 

script to be executed for viewing. In the preferred embodiment, playback engine 612 of 
FIG. 6 is identical to playback engine 334 of FIG. 3 discussed with reference to timeline 
editor 208 of authoring system 104. In one alternative embodiment, playback engine 334 
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has a different user interface or diagnostics than the client playback engine 612, as mi<*ht 
be contemplated for an editor versus a run-time tool. Script library 616 stores the 
program scripts that have been created and transferred to client 1 10. Each program script 
includes the Elastic EDL and some small preliminary content. Since the invention does 
not produce a client-side LBTV program until a viewer watches it, the program may be 
modified at any time, including during playback. The invention, using instructions in the 
Elastic EDL. may. in real-time, customize a program for the viewer, including removing 
segments of shows already seen by the viewer, expanding segments which include topics 
the viewer previously expressed interest in, and customizing a commercial to the 
viewer's tastes. The invention may also produce, in real-time, video edits, special effects, 
titles, graphics, and audio streams. 

When a viewer starts playing a program, using a script stored in the library 616, 
additional raw content that is not on client 1 10 is sent from server 108 in temporal order. 
Audio and video objects, normally transferred from server 108 to create the LBTV 
program, are stored in audio library 634 and video library 620, respectively. Pre- 
generated canned video elements such as borders, shadows, etc., are also stored in video 
library 620. When a program seeks to display an audio or video object, the program 
searches for that object in the appropriate library 634 or 620. Algorithmic video and 
audio module 628 is used to generate video images based on a defined algorithm 
provided by playback engine 612. Keying and layering 638 provides the capabilities for 
a viewer to see one picture layered on another picture, or one video object superimposed 
over a background. Client 1 10 uses communication unit 642 to transmit and receive data 
from server 108. 
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Referring now to FIG. 7, a block diagram illustrates the playback engine 612 
which orchestrates the final production of an LBTV program. Playback engine 612 
includes a speed coordinator 704, a synchronization module 708, and a transitional 
smoothness module 712. Speed coordinator 704 is responsible for ensuring that the 
5 LBTV program scripts run equivalently on both slow (PC 486) and faster (PC Pentium) 
machines. A program that runs on a faster machine does not necessarily complete its 
tasks in a shorter time, but the program's audio and video objects are usually of better 
quality. Further, a program created for a faster machine, using a mapping technique, can 
run on a slower machine. For example, when running on faster machines a program uses 
10 a 24-bit color representation for higher-quality video, but will map to an 8-bit color 

representation when running on slower machines. Similarly, audio in slower machines 
can also be mapped using fewer bits than the normal 16-bit representation for faster 
machines. Speed coordinator 704 first determines the speed at which playback engine 
612 is running, and, based on this speed, takes actions accordingly. The speed of 
15 playback engine 612 varies, depending on the speed of other system components, such as 
graphic card, audio card, system mother board, etc. 

Synchronization module 708 establishes synchronization points to synchronize 
different audio and video tracks that form the LBTV program. In between each two 
synchronization points the tracks may not synchronize, but at each synchronization point 
20 the tracks re-synchronize. In the preferred embodiment, one synchronization point is 

close enough to an adjacent point so that a program viewer cannot perceive that the tracks 
do not synchronize. Additionally, at each synchronization point, module 708 waits to 
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acquire all the data required for the next program segment before allowing the segment to 
be displayed. 

Transitional smoothness module 712 provides the high-quality, smooth and 
continuous appearance of a conventional television program. While the program is 
waiting to acquire all necessary data, module 712 uses several techniques to "mask out" 
the wait state so that a viewer perceives a smooth and continuous program. Module 712 
can instruct the program to fade out, prolong playing on one video/audio track, loop on 
one track continuously, or replay part of a track. Module 712 can also use an animation 
technique allowing some video objects to move around display 640 while the program is 
waiting. This is done using the capabilities provided by the keying and layering module 
638. Substitution can also be used. For example, if a program cannot play a desired 
musical instrument because the instrument is not available on the client 1 10 at the time 
needed, transition smoothness module 712 can replace the desired instrument with 
another similar instrument that is available on client 1 10. Similarly, if module 712 
requires a video that is not available, module 712 can request that algorithmic video 
module 628 create a video replacement based on algorithms defined by module 712. 

Referring now to FIG. 8, a flowchart illustrates the operation of the invention 
from the step of creating an LBTV program script through the step of playing the 
program. The method begins with creating the LBTV scripts in step 804. An LBTV 
program producer uses the authoring system 104 to generate the program scripts. Within 
the authoring system 104, timeline editor 208 is used to edit and add musical, audio, and 
video effects. A storyboard is created in which objects are sequentially arranged to 
produce the program script. Objects, represented by graphical icons within the 
storyboard, can be manipulated by dragging and rearranging the icons. The objects can 
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be moved to and from object libraries (e.g. music soundtrack 212, video coriftent 220, 
video effects 224 (FIG. 2)) and downloaded from remote servers 108. A script compiler 
330 converts the storyboard to an executable program file. 

As described for the model shown in FIG. 1, various LBTV scripts may be 
5 created and in step 808 are transferred to server 10S. Server 108 preferably sets up a web 
page and makes the scripts available to a viewer having access to the Internet 1 14. The 
web page of server 108 also preferably includes instructions for a viewer to request and 
download a desired LBTV program. Each LBTV program script can also be represented 
as a picture icon on the web page. 

10 In step 812 a viewer uses an Internet browser 608 to request a program for 

viewing. In the preferred embodiment, the viewer contacts the server 108. which then iff 
step 8 1 6 transfers the Elastic EDL along with a small amount of preliminary content. 
The EDL serves as the script to which the client system 1 10 orchestrates the Final 
production of the LBTV video on the client 1 lO's hardware. The invention, to effectively 

15 use the transmission bandwidth, sends only a description of how to recreate a given 
scene, rather than sending every video frame. 

In step 820 client system 1 10 starts executing the Elastic EDL for viewing the 
program with as little delay as possible. The Elastic EDL contains information on 
whether there are sufficient content segments to start playing the program. The delay 

20 depends on the system bandwidth and speed. Even if a given section of the program 

script occurs before all of the content has been transferred, client 1 10 continues and plays 
that section of the script, typically using "stand-in" content, such as line drawings, stock 
photographs, etc. While client 1 10 is running one program segment, the program in step 
821 is searching for data for subsequent segments. The data may be transferred from 
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server 108, or be locally available in the client 1 10 s storage disk including video library 
620 or audio library 634. Additionally, in step 822. while the program is running, server 
108 in the background sends additional raw content that is not available in client 1 10. 

If in step 824 the program finds data available, the data is retrieved in step 826 
and the program continues to step 828. If the data is not available in step 824, the 
program has several options. One option is that the program can substitute another piece 
of data having a similar function. For example, if a section of audio requires a 
synthesized piano voice and the piano voice is not immediately available, a pipe organ or 
similar available instrument may be substituted. As such, step 836 determines whether 
the substitute data is acceptable or feasible. When substitution is feasible, the program in 
step 840 retrieves and provides the substitute data for the program to continue in step ^ 
828. 

However, if in step 836 substitution is not feasible, the program in step 838 
requests algorithmic video and audio module 628 to generate data suitable for the next 
program segment. In step 844, the program determines whether data from module 628 is 
available. If so, the program in step 848 then retrieves the data from module 628 and 
continues in step 828. If the program, pursuant to step 844, determines that such data is 
not available, the program performs other functions in step 852. These functions include, 
for example, prolonging display of an image, redisplaying an image that has been 
displayed earlier, or playing some graphics animation until the desired data is completely 
transferred from server 108. The program in step 828 continues to fetch data and run 
until the program is complete. 

The invention has been explained above with reference to a preferred 
embodiment. Other embodiments will be apparent to those skilled in the art after reading 
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this disclosure. Therefore, these and other variations upon the preferred embodiment are 
intended to be covered by the appended claims. 
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WHAT IS CLAIMED IS: 



1 L A method for producing a multimedia program for viewing in a client system, 

2 comprising the steps of: 

3 storing on a server a plurality of scripts each corresponding to a respective one of 

4 said programs; 

5 transferring the script corresponding to one said program to said client system: 

6 viewing said transferred program at said client system; 

7 sending additional data in background to said client system while said program is 
S being viewed; 

9 retrieving said additional data by said client system; and 

10 incorporating said additional data into said program as said program is being 

1 1 viewed at said client system. 

1 2. The method of claim 1 wherein said program is substantially equivalent to a 

2 program selected from a group consisting of a television program, a QuickTime movie 

3 program, and a high-resolution imagery program. 

1 3. The method of claim 1 wherein the step of sending comprises the steps of: 

2 providing a video clip including a plurality of video tracks;. 

3 providing an audio clip corresponding to said video clip and including an audio 

4 track; 

5 sending said audio track to said client system; and 

6 sending said video tracks in a piecemeal manner to said client system. 
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1 4. The method of claim I further comprising the step of sending instructions on how 

2 to create said transferred program. 

1 5. The method of claim 4 wherein said instructions are in the form of an Elastic Edit 

2 Decision List. 

1 6. The method of claim 1 further comprising the step of. after the step of 

2 transferring, producing said program in reaJ-time. 

1 7. The method of claim 6 further comprising the step of modifying said program in 

2 real-time. ^ 

1 8. The method of claim 7 wherein the step of modifying is selected from a group 

2 consisting of the steps of removing a segment of said program, expanding a segment of 

3 said program, and customizing contents of said program. 

1 9. The method of claim 6 comprising the further step of, while a program segment of 

2 said program is being viewed, searching for data for subsequent segments. 

1 10. The method of claim 9 wherein said data for subsequent segments is selected 

2 from a group consisting of data in said server, data in said client system, data generated in 

3 real-time, data having a similar function, animation data, and data that has been displayed 

4 earlier. 
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1 II. The method of claim 1 further comprising the step of sending a description of 

2 how to create a scene in said transferred program. 

1 12. The method of claim 1 further comprising the step of using a communication 

2 connection selected from a group consisting of a low-bandwidth connection, a point-to- 

3 point modem, a cable modem, the Internet, a local area network, a wide are network, an 

4 intranet, and an terrestrial digital radio frequency band. 

1 13. The method of claim 1 wherein said scripts constitute an events-driven storyboard. 

1 14. The method of claim 1 comprising the further step of requesting for viewing said" 

2 program. 

1 15. A client-side multimedia-program authoring system, comprising: 

2 means for storing an authoring system script on a server; 

3 means for transferring said script to a client; 

4 means for viewing said program at said client; 

5 means for sending additional data in the background while said program is being 

6 viewed; 

7 means for retrieving said additional data; and 

8 means for incorporating said additional data into said program as said program is 

9 being viewed. 



24 



WO 00/10329 PCT/US99/18292 

1 16. A computer-readable medium storing codes to cause a computer to produce a 

2 multimedia program for viewing in a client system, by the steps of: 

3 storing on a server a plurality of scripts each corresponding to a respective one of 

4 said programs; 

5 transferring the script corresponding to one said program to said client system; 

6 viewing said transferred program at said client system; 

7 sending additional data in background to said client system while said program is 

8 being viewed; 

9 retrieving said additional data by said client system; and 

10 incorporating said additional data into said program as said program is being 

I 1 viewed at said client system. 
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LBTV PRODUCERS CREATE THE SCRIPTS ON AUTHORING SYSTEM 

104 
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SCRIPTS ARE TRANSFERRED TO SERVER 108 
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A VIEWER REQUESTS AN LBTV PROGRAM 
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SERVER 108 TRANSFERS THE REQUESTED SCRIPT AND ELASTIC EDL 



VIEWER STARTS VIEWING 
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